British Journal of Ophthalmology — Latest Matching Preprints

1

Incidence and Predictors of IOP-Lowering Treatment Following Detection of Referable Glaucoma in a Teleretinal Screening Program

Bolo, K.; Wong, B.; Do, J.; Ambite, J.-L.; Li, Z.; Kesselman, C.; Daskivich, L.; Xu, B.

2026-06-04 ophthalmology 10.64898/2026.06.02.26354782 medRxiv

Top 0.1%

14.7%

Purpose: To evaluate the incidence and baseline predictors of intraocular pressure (IOP)-lowering treatment following detection of referable glaucoma by teleretinal screening. Design: Retrospective cohort study. Methods: Participants were derived from a safety-net teleretinal diabetic retinopathy screening program (2013-2024). Participants included individuals who screened positive for referable glaucoma (cup-to-disc ratio [CDR] ≥0.6 or CDR asymmetry ≥0.2) and completed in-office diagnostic evaluation. The primary outcome was initiation of IOP-lowering treatment (medication, laser, or surgery) and the secondary outcome was intervention with surgery. Cumulative incidence functions were estimated, accounting for loss to follow-up. Fine-Gray models were used to identify baseline screening predictors to risk-stratify each outcome. Glaucoma diagnosis was approximated using diagnostic codes and chart review. Results: 2,367 participants were included. The cumulative incidence of treatment was 19.6% (95% CI: 18.0-21.2) at Year 1 and 45.1% (42.1-48.1) at Year 8. Early treatment occurred primarily in glaucoma cases, whereas treatment accumulated longitudinally in glaucoma suspects, reaching 36.5% (31.6-41.5) by Year 8. Surgery was less common (8-year incidence: 5.3%). Baseline screening data predicted treatment and surgery, enabling risk stratification. At Year 8, cumulative incidence differed substantially between high- and low-risk groups (treatment: 59.9% vs. 31.2%; surgery: 9.7% vs. 1.0%). Older age (sub-distribution hazard ratio [SHR] 1.03 per year, p<0.001), Black race (SHR 1.50, p<0.001), and personal history of glaucoma (SHR 1.90, p<0.001) were associated with treatment; Asian race was protective (SHR 0.71, p=0.03). Older age (SHR 1.06, p<0.001), worse visual acuity (SHR 5.11 per logMAR unit, p<0.001), and screening at a hospital-based site (SHR 2.46, p=0.003) were associated with surgical treatment.
Conclusion: Nearly half of safety-net diabetic patients screening positive for referable glaucoma initiated IOP-lowering treatment over 8 years, while few received surgery. Baseline screening characteristics enabled risk stratification of treatment and surgery. These findings address an evidence gap about longitudinal consequences of screening and suggest that its impact extends beyond detection of prevalent glaucoma to include identification of high-risk glaucoma suspects who warrant ongoing surveillance.
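
The referral rule stated in the Methods (CDR ≥0.6 or CDR asymmetry ≥0.2) can be illustrated with a minimal sketch. The function name, the per-patient two-eye inputs, and the choice to test the larger of the two CDR values are assumptions made for illustration; the abstract does not say how the program applied the rule in practice.

```python
def is_referable_glaucoma(cdr_right: float, cdr_left: float,
                          cdr_threshold: float = 0.6,
                          asymmetry_threshold: float = 0.2) -> bool:
    """Return True if either screening criterion from the abstract is met.

    Hypothetical helper: the thresholds come from the abstract, while
    applying the CDR threshold to the larger of the two eyes is an assumption.
    """
    tol = 1e-9  # guards against floating-point error, e.g. 0.7 - 0.5
    larger_cdr = max(cdr_right, cdr_left)
    asymmetry = abs(cdr_right - cdr_left)
    return (larger_cdr >= cdr_threshold - tol
            or asymmetry >= asymmetry_threshold - tol)


if __name__ == "__main__":
    print(is_referable_glaucoma(0.6, 0.5))   # meets the CDR criterion
    print(is_referable_glaucoma(0.3, 0.5))   # meets the asymmetry criterion
    print(is_referable_glaucoma(0.4, 0.5))   # meets neither criterion
```

The small tolerance is a design choice: without it, a pair such as 0.7 and 0.5 would be judged by a floating-point difference slightly below 0.2.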

2

Design and Validation of an AI-Assisted Sequential Screening Framework for Psychological Distress in Glaucoma

Chou, N. A.; Baek, Y.; Feng, F.; Lu, K.; Choi, E. Y.; Fisher, H. M.; Malek, D.; Jammal, A.; Somers, T. J.; Muir, K. W.; Medeiros, F. A.; Berchuck, S. I.

2026-05-22 ophthalmology 10.64898/2026.05.20.26353679 medRxiv

Top 0.1%

10.1%

Purpose: Psychological distress is highly prevalent in glaucoma and is associated with worse adherence, reduced quality of life, and faster disease progression. However, distress is rarely assessed in ophthalmology settings due to time, workflow, and staffing constraints. We evaluated two artificial intelligence (AI)-based screening strategies, designed to efficiently identify distressed primary open-angle glaucoma (POAG) patients during routine care, aiming to achieve effective, resource-conscious, low-burden clinical screening. Design: Hybrid retrospective cohort and prospective cross-sectional study. Participants: The retrospective cohort included >3,000 POAG patients from the Duke Ophthalmic Registry. Prospective validation was conducted in a separate cohort of 300 POAG patients who completed patient-reported distress screening. Methods: Using retrospective data, a neural network model was trained to predict an electronic health record (EHR)-derived computable phenotype of distress ("silver standard"). Prospective validation used the 8-item Patient Health Questionnaire (PHQ-8) as the "gold standard." Three screening strategies were compared against PHQ-8: (1) universal PHQ-2 screening (two-item screener administered to all patients), (2) AI-only screening (fully automated EHR-based screener), and (3) sequential screening (only patients flagged as high risk by the AI screener completed the PHQ-2). Performance metrics included sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and screening burden. Main Outcome Measures: Sensitivity; specificity; PPV; NPV; accuracy; proportion of patients requiring secondary screening (screening burden). Results: Distress prevalence was 17% (PHQ-8 > 6). Universal PHQ-2 screening (> 0) achieved high sensitivity (0.96) but lower specificity (0.73) and PPV (0.41), while requiring screening of all patients.
The AI-assisted sequential approach substantially reduced screening burden while maintaining strong diagnostic performance. By administering PHQ-2 to ~25% of patients, sequential screening achieved sensitivity 0.64, specificity 0.93, PPV 0.64, NPV 0.93, and accuracy 0.88, representing a ~50% increase in PPV compared to PHQ-2 alone. AI-only screening reduced burden further but did not achieve comparable sensitivity or predictive performance. Conclusions: AI-assisted sequential screening enables scalable, resource efficient identification of psychological distress in glaucoma care, substantially reducing screening burden while preserving clinically meaningful performance. This framework offers a practical pathway for integrating distress screening into routine ophthalmology workflows and improving the identification and referral of at-risk patients.
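
The reported predictive values follow from sensitivity, specificity, and prevalence through the standard Bayes relationships. The sketch below recomputes them from the rounded figures in the abstract; the function and variable names are illustrative, and using the published 17% prevalence as the input is an assumption about how the study's own values were derived.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Compute PPV, NPV, and accuracy from population-level proportions."""
    tp = sensitivity * prevalence              # true positives
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = tp + tn
    return ppv, npv, accuracy


# Inputs are the rounded values reported in the abstract.
universal = predictive_values(sensitivity=0.96, specificity=0.73, prevalence=0.17)
sequential = predictive_values(sensitivity=0.64, specificity=0.93, prevalence=0.17)

if __name__ == "__main__":
    for name, (ppv, npv, acc) in [("universal PHQ-2", universal),
                                  ("sequential", sequential)]:
        print(f"{name}: PPV={ppv:.2f} NPV={npv:.2f} accuracy={acc:.2f}")
```

With these rounded inputs the recomputed PPVs come out near 0.42 and 0.65, close to the reported 0.41 and 0.64; small differences are expected because the published inputs are themselves rounded.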

3

Small palpebral fissure as a significant risk factor for glaucoma surgery failure

Okuzumi, N.; Mori, S.; Katakami, K.; Iwaki, Y.; Sakamoto, M.; Yamada, Y.; Nakamura, M.

2026-05-28 ophthalmology 10.64898/2026.05.27.26354208 medRxiv

Top 0.1%

8.6%

Purpose: To evaluate the impact of "not commonly considered risk factors" on glaucoma surgical outcomes. Methods: This study included 339 eyes that underwent glaucoma surgery. Surgical procedures included microhook ab-interno trabeculotomy (TLO), Preserflo ab-externo microshunt implantation, trabeculectomy (Trab), and Ahmed Glaucoma Valve (AGV) implantation. In addition to conventional background factors, we examined a set of "not commonly considered risk factors," including very elderly age (≥85 years), avitreous status, aphakia, use of antithrombotic agents, difficulty attending frequent postoperative visits, small palpebral fissure, corneal endothelial dysfunction, poor vision in the fellow eye, dementia, hearing loss, mental illness, atopic dermatitis, pseudophacodonesis, glaucoma eye drop allergy, and conditions contraindicating β-blocker use. Surgical success was defined as intraocular pressure (IOP) ≤21 mmHg, ≥20% reduction from baseline, and no additional glaucoma surgery at 1 year. Logistic regression was performed to identify potential risk factors; significant factors were further evaluated using propensity score matching. Results: Of the 339 cases, surgical success rates were 65% for TLO, 82% for Preserflo, 91% for Trab, and 82% for AGV. Multivariate logistic regression identified two independent predictors of surgical failure: small palpebral fissure (odds ratio 2.52, p < 0.01) and hearing loss (odds ratio 3.94, p = 0.04). Propensity score matching of patients with small versus large palpebral fissures (111 per group) confirmed significantly worse postoperative outcomes in the small-palpebral-fissure group despite balanced baseline characteristics. Conclusion: Small palpebral fissure is an independent and previously unnoticed risk factor for glaucoma surgical failure, affecting both minimally invasive and filtration procedures.
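
The three-part success definition above can be written as a single predicate. This is a minimal sketch; the function name and argument names are hypothetical, and it assumes a single IOP reading at 1 year, which the abstract does not specify.

```python
def is_surgical_success(baseline_iop: float, iop_at_1_year: float,
                        additional_surgery: bool) -> bool:
    """Apply the abstract's success definition to one eye.

    Success requires all three conditions: IOP <= 21 mmHg, a reduction of
    at least 20% from baseline, and no additional glaucoma surgery.
    """
    if baseline_iop <= 0:
        raise ValueError("baseline_iop must be positive")
    relative_reduction = (baseline_iop - iop_at_1_year) / baseline_iop
    return (iop_at_1_year <= 21
            and relative_reduction >= 0.20
            and not additional_surgery)


if __name__ == "__main__":
    print(is_surgical_success(30, 18, False))  # all criteria met
    print(is_surgical_success(24, 20, False))  # reduction below 20%
    print(is_surgical_success(30, 15, True))   # needed further surgery
```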

4

Developing and Evaluating Deep Learning Approaches for Visual Field Denoising in Glaucoma

Baek, J. S.; Lokhande, A.; Neuenschwander, D.; Shi, M.; Wang, M.

2026-06-01 ophthalmology 10.64898/2026.05.29.26354019 medRxiv

Top 0.1%

8.5%

Purpose: To investigate the relative efficacy of nine distinct visual field (VF) denoising artificial intelligence (AI) methods and a pathology-aware AI strategy to discourage over-correction of glaucomatous defects. Design: Retrospective study. Participants: 87,940 paired visual field (VF) and optical coherence tomography (OCT) samples from a tertiary academic center. Methods: Denoising models were trained on a separate VF-only dataset and evaluated on an independent structure-function dataset of paired VF-OCT samples. We implemented and evaluated nine distinct VF denoising strategies representing three broad categories: baseline measurements, self-supervised and image restoration models (including Noise2Noise, Noise2Void, and NAFNet), and latent variable compression-based models (autoencoders and variational autoencoders). All models were designed to reconstruct VF sensitivity maps. We then predicted retinal nerve fiber layer thickness (RNFLT) maps from the denoised VFs using a fixed, independently trained VF-to-RNFLT prediction model. Main Outcome Measures: Predicted VF and RNFLT maps and resultant evaluation metrics. Results: The raw VF baseline achieved a global R² of 0.5468 and MAE of 16.83 µm. Restoration-based models maintained or slightly improved concordance, with the pathology-aware NAFNet achieving the highest global R² of 0.5485 and a comparable MAE of 16.82 µm. In contrast, compression-based models degraded concordance, with CNN-VAE showing a significant reduction (R² approximately 0.50). In severe glaucoma, concordance decreased across all methods; however, compression architectures exhibited disproportionately greater degradation compared with restoration-based approaches. Conclusions: We present a comparative benchmark of AI-based VF denoising strategies paired with structure-function evaluation. While restoration-based models can reduce variability without loss of biological signal, latent compression risks attenuating clinically meaningful defects.
Visually smoother fields are not necessarily more biologically accurate.

5

Neovascular Glaucoma at a Tertiary Centre in Finland, 2008-2024: A Retrospective Cohort Study

Simons, G.; von Fersen, M.; Summanen, P.; Harju, M.

2026-06-02 ophthalmology 10.64898/2026.06.01.26354330 medRxiv

Top 0.1%

7.1%

Background/Aims: Neovascular glaucoma (NVG) is an aggressive secondary glaucoma with limited longitudinal data. This study reports the aetiologies, treatments, and longitudinal outcomes in NVG. Methods: Patients with NVG were identified through electronic medical record review. Inclusion required documented rubeosis of the iris and/or anterior chamber angle, intraocular pressure (IOP) ≥25 mmHg, diagnosis during 2008-2024, and follow-up at Helsinki University Hospital. Baseline data and all follow-up visits were included. Results: Of 919 patients identified, 626 met inclusion criteria, with a median follow-up of 24 months. The estimated NVG incidence was 2.2/100,000/year. The most common aetiology was central retinal vein occlusion (CRVO; 45%), followed by diabetic retinopathy (DR; 14%), central retinal artery occlusion (CRAO; 11%), and ocular ischaemic syndrome (8%). Half of patients had hand motion vision or worse at baseline, with 18% at no light perception (NLP). At 5 years, 13% of patients had Snellen 6/60 vision or better. Visual outcomes differed by aetiology, with median time to NLP ranging from 1.6 (CRAO) to 9.1 (DR) years (log-rank p=0.002). Median baseline IOP was 40 mmHg, decreasing to 21 mmHg by 1 year. Ocular pain fell from 43% at baseline to 11% at last follow-up. Structural eye loss (e.g., enucleation or phthisis) occurred in 3% by 5 years. Conclusion: The estimated incidence was lower than previously reported elsewhere. Unlike other cohorts where DR predominates, CRVO was the most common aetiology, and visual prognosis was strongly aetiology-dependent. Glaucoma drainage device surgery reached 7.6% at 3 years, despite the severity and refractory nature of NVG.

6

Deriving OCT-Equivalent Retinal Nerve Fiber Layer Thickness Maps from Fundus Photographs with Deep Learning Improves Glaucoma Diagnosis

Shi, L.; Shi, M.; Chung, I. Y.; Pasquale, L. R.; Shen, L. Q.; Wang, M.

2026-05-27 ophthalmology 10.64898/2026.05.26.26354047 medRxiv

Top 0.1%

4.9%

Purpose: To develop and evaluate a deep learning model that predicts optical coherence tomography (OCT)-equivalent retinal nerve fiber layer thickness (RNFLT) maps directly from color fundus photographs and to assess their diagnostic value for glaucoma detection. Design: Retrospective model development and evaluation study. Participants: 15,031 paired fundus photographs and spectral-domain OCT scans collected at Massachusetts Eye and Ear between 2011 and 2022. Methods: Paired fundus and OCT images were used to train a U-Net-based model to predict pixel-wise RNFLT maps with artifact-corrected supervision. Diagnostic performance was evaluated across single-modality models (fundus photos only, real RNFLT maps, predicted RNFLT maps) and multimodal fusion models (fundus + predicted RNFLT maps). Stratified analyses examined model performance across glaucoma severity and demographic subgroups. Glaucoma was defined based on standard criteria applied to Humphrey 24-2 visual field testing. Main Outcome Measures: Mean absolute error (MAE) and structural similarity index (SSIM) for RNFLT map prediction. Area under the ROC curve (AUC) and accuracy for glaucoma detection. Results: RNFLT map prediction achieved an MAE of 15.4 µm and an SSIM of 0.65, measured against artifact-corrected RNFLT maps derived from OCT. For glaucoma detection, the predicted RNFLT-only classifier outperformed the fundus-only classifier (AUC 0.889 vs 0.883, p < 0.005; Accuracy 82.0% vs 78.0%), but performed worse than the real-RNFLT-only classifier (AUC 0.889 vs 0.903, p < 0.005). Multimodal fusion of fundus images with predicted RNFLT maps improved performance, achieving an AUC of 0.909, outperforming all single-modality inputs (p < 0.005 vs fundus-only, predicted-RNFLT-only, and real-RNFLT-only).
Performance gains between the fundus-only and the multimodal classifier were greater in early-stage glaucoma compared to severe cases: accuracy increased from 55.3% to 64.0% in mild cases, from 71.5% to 80.4% in moderate cases, and from 90.0% to 94.6% in severe cases. Conclusions: Predicted RNFLT maps derived from fundus photographs provide quantitative, OCT-like structural information and improve glaucoma detection. Unlike prior work that predicted only summary RNFLT values, our model generates full RNFLT maps that better support glaucoma classification than fundus images alone. This approach offers a scalable pathway for early glaucoma screening and expands diagnostic access in resource-limited settings.

7

Atlas of Quality of Life in Binocular Visual Field Loss: A Comprehensive Study

Song, L.; Zha, L.; Lokhande, A.; Baek, J.; Wang, J.; Wang, M.

2026-06-03 ophthalmology 10.64898/2026.06.02.26354170 medRxiv

Top 0.1%

4.2%

Purpose: To quantify the binocular integrated visual field (IVF) loss patterns with archetypal (AT) analysis and their associations with patients' Quality of Life (QoL). Design: Retrospective study. Participants: Over 125,000 patients from three datasets from Massachusetts Eye and Ear and Glaucoma Research Network Consortium. Methods: We used: (1) the Glaucoma Research Network excluding the Massachusetts Eye and Ear subset for the binocular archetypal model training (77,270 IVFs from 77,270 patients), (2) Massachusetts Eye and Ear dataset for demographic correlation analysis (47,965 IVFs from 47,965 patients), and (3) the MEE Quality of Life Survey dataset for QoL correlation analysis (75 IVFs from 75 patients). The whole study was restricted to the most recent VF measurements from each subject, and binocular VFs were constructed by the integrated visual field method, which takes the higher sensitivity at each test location. We first applied archetypal analysis to cluster 24-2 binocular VFs into archetypal patterns. The total number of patterns was determined by the Bayes factor. Pearson's correlations analyzed the associations between patients' demographic information, binocular VF patterns and QoL scores, and the coefficients were set to 0 if p-values corrected by multiple comparisons < 0.05. Main Outcome Measures: Binocular VF archetypal patterns and their relationships with demographic divergences and QoL. Results: We identified 17 binocular VF loss patterns. Patterns with major vision impairment (AT10, AT12, AT13, AT14, and AT17) were more common in older patients, while Black or African Americans exhibited a broader spectrum of visual loss, notably AT5 and AT12, compared to Asian and White counterparts. 81 MEE patients with QoL survey data were analyzed to investigate the impact of demographic and vision-related variables on QoL. Older age and female gender were significantly associated with lower QoL.
Binocular central vision loss (AT 5) and total vision loss (AT 12) had a significantly greater impact on QoL than binocular peripheral vision loss (AT 2, AT 5, AT 16). Conclusions: Individuals with central or total vision loss, as well as certain demographic groups, experience a significantly greater impact on quality of life. The quantifications of binocular VF loss patterns by archetypal analysis may help better understand glaucoma's impact on patients' quality of life.
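
The integrated visual field construction described in the Methods (taking the higher sensitivity at each test location) can be sketched in a few lines. The function name is hypothetical, and the sketch assumes the two monocular fields have already been aligned so that index i refers to the same binocular location in both eyes; the abstract does not describe that alignment step.

```python
from typing import List, Sequence


def integrated_visual_field(right_eye: Sequence[float],
                            left_eye: Sequence[float]) -> List[float]:
    """Combine two monocular sensitivity lists (dB) into a binocular field.

    At each test location the higher of the two sensitivities is kept.
    Both inputs are assumed to be pre-aligned, equal-length sequences.
    """
    if len(right_eye) != len(left_eye):
        raise ValueError("both fields must have the same number of locations")
    return [max(r, l) for r, l in zip(right_eye, left_eye)]


if __name__ == "__main__":
    # Toy three-location fields, not real 24-2 data.
    print(integrated_visual_field([30, 10, 0], [25, 28, 0]))
```

A location counts as lost in the binocular field only when both eyes are insensitive there, which is why the last toy location stays at 0 dB.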

8

Safety and Biological Activity of Intravitreal OGX110, a CXCR3 Agonist, in Persistent Neovascular Age-Related Macular Degeneration: A Phase I Dose-Escalation Study

Wells, A.; Boyer, D.; Goldberg, R.; Hohman, T.; Maturi, R.; Patel, S.

2026-05-30 ophthalmology 10.64898/2026.05.21.26353430 medRxiv

Top 0.1%

3.7%

Purpose: To evaluate the safety and exploratory outcomes of a single intravitreal injection of OGX110, a peptide agonist of CXCR3, in eyes with persistent fluid secondary to neovascular age-related macular degeneration (nAMD) despite ongoing anti-vascular endothelial growth factor (anti-VEGF) therapy. Methods: This prospective, open-label, sequential dose-escalation phase I study (NCT05904691) enrolled subjects receiving standard-of-care intravitreal anti-VEGF therapy. Subjects received a single intravitreal injection of OGX110 at 0.5 mg, 1.0 mg, or 2.0 mg (n=3 per cohort), 7 to 14 days after the anti-VEGF injection. Results: All nine enrolled subjects completed follow-up through day 56. Two subjects (22%) experienced at least 1 adverse event (AE); all were mild and unrelated to study treatment. Exploratory analyses showed a BCVA change of +1.4 letters following anti-VEGF injection and +4.4 letters from OGX110 baseline to 4 weeks (P < 0.05). Six of 9 subjects gained at least 3 ETDRS letters after OGX110. Anatomic responses were heterogeneous. Four eyes showed a reduction in CRT after anti-VEGF injection that was maintained after OGX110 administration. One additional eye demonstrated a substantial reduction in CRT after OGX110 despite minimal response to anti-VEGF treatment. Conclusions: A single intravitreal injection of OGX110 was well tolerated. Exploratory functional and anatomic findings suggest biologic activity; interpretation is limited by small sample size, open-label design, absence of a concurrent control group, and inter-subject heterogeneity. These results support further study in a controlled trial. Translational Relevance: OGX110 represents a mechanistically distinct investigational approach for nAMD that may warrant further evaluation in eyes with persistent.

9

Evaluating OCT Device-Reported Image Quality Score: Towards a Task-Specific Quality Gate for Deep Learning-based Outer-Retina and Choroid Boundary Segmentation

Gadari, A.; Vichare, A. A.; Corona, F.; Vupparaboina, S. C.; Lall, S. R.; Gregori, G.; Hasan, N.; Sahel, J.-A.; Chhablani, J.; Bollepalli, S. C.; Vupparaboina, K. K.

2026-05-20 ophthalmology 10.64898/2026.05.17.26353399 medRxiv

Top 0.1%

3.6%

Manufacturer-defined signal-strength indices are frequently employed as quality benchmarks for automated optical coherence tomography analysis, yet their empirical relationship with deep learning segmentation accuracy remains unclear. Because these metrics were originally developed for conventional image-processing pipelines, their ability to predict modern model-based segmentation accuracy has not been empirically validated. To address this gap, we evaluated the Heidelberg Spectralis Q-score against U-Net segmentation performance across 5,047 B-scans from 103 eyes for three anatomical boundaries of the posterior segment of the eye: the Ellipsoid Zone (EZ), Bruch's Membrane (BM), and Choroid Outer Boundary (COB). Alongside standard boundary agreement metrics (MAE, MSE, Dice Similarity Coefficient), we adapted the Earth Mover's Distance (EMD) from optimal transport theory as a boundary evaluation metric. Unlike column-wise averages, EMD quantifies boundary agreement as a 2-D geometric displacement, directly measuring residual spatial displacement between the model-segmented boundary and the ground-truth boundary. Our results demonstrate that the Q-score, originally designed to gate image-processing-based automated analysis, is a poor predictor of deep learning boundary segmentation accuracy, with explained variance (R²) failing to exceed 1.4% across all three boundaries. We further observed a monotonically increasing error hierarchy with anatomical depth (EZ < BM < COB), consistent across metrics, which is unexplained by the signal strength. At the COB, correlations were paradoxically positive, explained by a B-scan-level mediation chain: higher Q-scores correspond to greater choroidal thickness (r=0.113, ρ=0.158), which in turn predicts higher COB segmentation error (r=0.165, ρ=0.191), a localization difficulty that global signal strength cannot capture.
Collectively, these findings challenge the implicit assumption that signal-strength-based quality thresholds are a reliable proxy for deep learning model performance, and motivate a shift toward task-specific acquisition quality criteria calibrated to model performance rather than signal interpretability.

10

Development and Pilot Testing of a Mobile App Psychosocial Intervention for Psychological Distress in Individuals with Glaucoma

Fisher, H. M.; Chou, N. A.; Falkovic, M.; Parnell, H.; Makarushka, C.; Fish, L. J.; Plumb Vilardaga, J.; Medeiros, F. A.; Somers, T. J.; Muir, K. W.; Berchuck, S. I.

2026-05-22 ophthalmology 10.64898/2026.05.20.26353674 medRxiv

Top 0.1%

3.1%

Objective: To assess the feasibility and acceptability of VISION-ACT, a standalone, mobile app psychosocial intervention for psychological distress in individuals with primary open-angle glaucoma (POAG). Design: Single-arm pilot. Participants: Patients (N=28) with a diagnosis of POAG, self-reporting at least mild (>3) distress on the 4-item Patient Health Questionnaire, were recruited from the Duke Eye Center between April 2025 and December 2025. Methods: Patients (n=28) were consented and completed a baseline (A1) self-report assessment. VISION-ACT comprised 6 weekly modules. Follow-up self-report assessments occurred at post- (A2) and 1-month post-intervention (A3) and included measures of psychological distress, vision and health-related quality of life, psychological flexibility, disease acceptance, self-efficacy for symptom management, mindfulness, and social support. Participants were invited to complete an exit interview at 1-month post-intervention to gather qualitative feedback on the VISION-ACT protocol. Descriptive statistics were used to assess feasibility and acceptability metrics, and patterns of pre-post change on patient-reported outcomes were explored with linear mixed models using R Statistical Software. Main Outcome Measures: Feasibility (target accrual (n=25) in 12 months, <20% attrition at post-intervention); Acceptability (>75% reporting use of VISION-ACT skills or ideas at post-intervention, >80% reporting M>3.00/4.00 at post-intervention on the Client Satisfaction Questionnaire); Psychological Distress (Hospital Anxiety and Depression Scale [HADS], Subjective Units of Distress Scale [SUDS]). Results: VISION-ACT was highly feasible; accrual target was surpassed (N=28) in 6 months, and attrition was low (3.85%) at post-intervention (A2). Acceptability was strong with 100% of participants reporting use of VISION-ACT skills or ideas at A2 and M=3.27/4.00 intervention satisfaction.
Adherence was remarkable with 88.5% of participants completing all six VISION-ACT modules. Pre-post change patterns were in the expected direction for psychological distress (HADS A1 M=13.88, A2 M=11.21; SUDS A1 M=35.54, A2 M=26.46) and all other patient-reported outcomes across baseline, post- and 1-month post-intervention assessments. Data on participant perspectives highlighted valuable aspects of VISION-ACT and areas for refinement. Conclusions: The robust feasibility and acceptability data seen here support a fully powered, randomized trial to evaluate the efficacy of VISION-ACT for reducing psychological distress and improving related patient-reported and clinical outcomes.

11

Deep Learning Prediction of Personalized Peripapillary Retinal Nerve Fiber Layer Thickness Norms from Fundus Images in Glaucoma

Yildiz, E.; Zha, L.; Zebardast, N.; Shi, M.; Wang, M.

2026-05-27 ophthalmology 10.64898/2026.05.26.26354081 medRxiv

Top 0.1%

2.7%

Purpose: To predict retinal nerve fiber layer thickness (RNFLT) norms from fundus images. Methods: We selected 18,000 OCT scans and visual fields (VF) from the Massachusetts Eye and Ear Glaucoma Service. A U-Net-based deep learning model was developed to predict RNFLT norms from OCT en face fundus images. A total of 10,000 OCT scans with normal VFs (mean deviation [MD] ≥ -1 dB, glaucoma hemifield test within normal limits, and pattern standard deviation probability > 5%) tested within 30 days were used for training, while the remaining 8,000 OCT scans (mean VF MD: 3.3 ± 4.9 dB), including 2,419 scans with normal VFs, were used for evaluation. Structure-function correlations between RNFLT maps and VFs were assessed using linear regression and VGG-16 across original RNFLT maps, deviation maps, and their combination. Performance was evaluated using correlation coefficients, mean absolute error (MAE), and R-squared. Results: Predicted RNFLT norm maps showed agreement with baseline RNFLT maps in eyes with normal VFs (R-squared = 0.81 ± 0.13). RNFLT deviation maps correlated more strongly with VF MD than original RNFLT maps (R = 0.42 vs. 0.19, p < 0.01). In deep learning-based VF prediction, combining original and deviation maps achieved the best performance (MAE = 3.31 dB, R-squared = 0.39), outperforming the model (p < 0.05) using original RNFLT maps alone (MAE = 3.36 dB, R-squared = 0.35). Conclusions: Deep learning can estimate individualized RNFLT norms and improve structure-function assessment in glaucoma. Translational Relevance: Personalized RNFLT norm prediction may improve detection of glaucomatous damage.

12

GWAS Meta-analysis Identifies Novel Associated Loci and Points to Causal Tissues in Central Serous Chorioretinopathy

Chen, L.; Kim, S. H.; Truong, B.; Rämö, J. T.; Gorman, B. R.; van Dijk, E. H. C.; Brinks, J.; Nikopensius, T.; Choi, S. H.; Kajanne, R.; Mehtonen, J.; Kaarniranta, K.; Sobrin, L.; Kurki, M.; Yzer, S.; VA Million Veteran Program; FinnGen; Wu, W.-C.; Turunen, J. A.; Segre, A. J.; Mercader, J. M.; Huerta, A.; Daly, M. J.; Palotie, A.; Ellinor, P. T.; Boon, C. J.; Iyengar, S. K.; Peachey, N. S.; Natarajan, P.; Rossin, E. J.

2026-05-22 ophthalmology 10.64898/2026.05.20.26353693 medRxiv

Top 0.1%

2.4%

Objective: To define CSC genetic architecture and identify implicated ocular tissues, cell types, genes, and circulating proteins. Data Sources: Genome-wide data were assembled from FinnGen, All of Us, Mass General Brigham Biobank, Million Veteran Program, and a Dutch chronic CSC cohort. Serum protein quantitative trait loci, human single-cell ocular atlases, and UK Biobank macular optical coherence tomography (OCT) imaging were used for downstream analyses. Study Selection: Five European-ancestry cohorts with genome-wide data and cohort-specific CSC case-control definitions were included, comprising 2,584 cases and 1,044,455 controls. Variants present in at least 2 cohorts were meta-analyzed. Data Extraction and Synthesis: Cohort-level GWASs were adjusted for age, age squared, sex, genotyping array or batch, and 10 genetic principal components, then combined using fixed-effects inverse-variance meta-analysis. Post-GWAS analyses included gene prioritization, colocalization, Mendelian randomization, single-cell disease-relevance scoring, and testing of a CSC genetic risk score in UK Biobank OCT images. Main Outcome(s) and Measure(s): Genome-wide significant CSC loci, effector genes and proteins, tissue and cell-type enrichment, and CSC-relevant OCT abnormalities. Results: Across 11,068,938 variants, 10 loci reached genome-wide significance (P < 5e-8), including 3 novel loci near TGFB1, LINC00551, and LOC105375630 and 7 replicated loci near CFH, CD46, NOTCH4, PREX1, PTPRB, GATA5, and TNFRSF10A. Integrative analyses prioritized 10 candidate effector genes. Colocalization and Mendelian randomization implicated circulating TNFRSF10A, TGFB1, and CASP10 levels. Single-cell analyses localized genetic risk to sclera (P = 2.0e-4) and vascular endothelial cells (P = 4.0e-4), with fibroblast enrichment. 
In UK Biobank, OCT abnormalities were more frequent in the top vs bottom 1% of CSC genetic risk (18 of 109 [16.5%] vs 8 of 134 [6.0%]; odds ratio, 4.05; 95% CI, 1.65-10.87; P = .002). Conclusions and Relevance: In this GWAS meta-analysis, CSC susceptibility localized predominantly to scleral and vascular biology rather than primary retinal pigment epithelial dysfunction. These findings support CSC as a sclerovascular disorder and nominate complement regulation, endothelial signaling, and extracellular matrix pathways for future study.
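
The fixed-effects inverse-variance meta-analysis named in the Data Extraction and Synthesis step combines per-cohort effect estimates by weighting each with the reciprocal of its variance. The sketch below shows the textbook formula for a single variant; the function name and the toy inputs are illustrative and are not values from the study.

```python
import math
from typing import Sequence, Tuple


def fixed_effects_meta(betas: Sequence[float],
                       standard_errors: Sequence[float]) -> Tuple[float, float]:
    """Pool per-cohort effect sizes with inverse-variance weights.

    Returns the pooled effect and its standard error:
        beta = sum(w_i * b_i) / sum(w_i),  se = sqrt(1 / sum(w_i)),
    where w_i = 1 / se_i**2.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


if __name__ == "__main__":
    # Toy log-odds-ratio estimates from two hypothetical cohorts.
    beta, se = fixed_effects_meta([0.2, 0.4], [0.1, 0.2])
    print(f"pooled beta={beta:.3f}, se={se:.4f}, z={beta / se:.2f}")
```

The more precise cohort dominates the pooled estimate, which is why the toy result sits closer to 0.2 than to 0.4.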

13

Can Artificial Intelligence Match Dermoscopy in Melanoma Detection? Evidence from a Systematic Review and Meta-analysis of Pigmented Skin Lesions

Tang, H.; Zhu, Y.; Diao, M.

2026-05-20 dermatology 10.64898/2026.05.15.26353363 medRxiv

Top 0.2%

2.1%

Accurate risk stratification of pigmented skin lesions is critical for early melanoma detection and for reducing unnecessary excisions. Artificial intelligence (AI) is increasingly applied to dermoscopic image analysis, but its diagnostic performance relative to standard dermoscopy in real-world clinical settings remains uncertain. To address this gap, we conducted a systematic review and meta-analysis of prospective clinical studies directly comparing AI alone, dermoscopy, and AI-assisted clinicians for malignancy risk assessment of pigmented skin lesions. We systematically searched PubMed, Embase, Web of Science, and Cochrane Library from inception to January 2026. Ten studies with 17 diagnostic arms (10 dermoscopy arms, 6 AI-alone arms, and 1 AI-assisted clinician arm) were included. Pooled sensitivity and specificity were 0.773 (95% CI, 0.648-0.863) and 0.793 (95% CI, 0.673-0.877) for dermoscopy, and 0.757 (95% CI, 0.428-0.928) and 0.859 (95% CI, 0.619-0.958) for standalone AI. Summary ROC curves showed overlapping performance, indicating that autonomous AI is broadly comparable to dermoscopy but does not demonstrate a consistent advantage. Heterogeneity in AI performance was driven almost entirely by threshold effects rather than by differences in inherent model capacity. AI-assisted clinicians showed promising results (sensitivity 1.000, specificity 0.837) in a single study, but more evidence is needed. Our findings suggest that, at present, AI should be viewed as a complementary decision-support tool rather than a replacement for dermoscopic evaluation. The study provides valuable evidence for clinicians, guideline developers, and researchers working on AI integration into melanoma diagnostic pathways.

14

Inter-relationship of Retinal, Choroidal, and Scleral Thickness in High Myopia

Panigrahi, S.; Dhakal, R.; Vupparaboina, K. K.; Verkicharla, P. K.

2026-05-17 ophthalmology 10.64898/2026.05.13.26353083 medRxiv

Top 0.2%

1.9%

Purpose: Considering that myopia is associated with thinning of the ocular coats, this study investigated the inter-relationship of retinal, choroidal and scleral thickness in foveal regions in Indian high myopes. Methods: A total of 23 high myopes (spherical equivalent refraction ≤ -6.00 D) aged 16 to 35 years underwent posterior segment imaging with swept-source optical coherence tomography. The retinal, choroidal and scleral thickness was determined using semi-automated custom-designed software at sub-foveal regions. Axial length was determined using Lenstar LS 900 non-contact biometer. Results: The mean ± SD axial length was 30.17 ± 2.23 mm, sub-foveal retinal thickness was 245 ± 28 µm, sub-foveal choroidal thickness was 82 ± 46 µm, and sub-foveal scleral thickness was 254 ± 68 µm. The choroid was significantly thinner compared to the retina and sclera (p<0.001). With a 1 mm increase in axial length, there was no significant variation in sub-foveal retinal (increased by 0.86 µm) and scleral thickness (decreased by 4.31 µm, p≥0.05), but sub-foveal choroidal thickness decreased by 10.35 µm (p=0.02). For a 1 D decrease in spherical equivalent refraction, the choroidal thickness reduced significantly (decreased by 5.88 µm, p<0.001), while there was no significant variation in retinal (decreased by 0.68 µm, p=0.55) and scleral thickness (increased by 0.13 µm, p=0.98). The association of the sub-foveal retinal, choroidal, and scleral thickness was weak and was not significant in high myopes (p≥0.10). Conclusions: With increasing axial length and severity of myopia in high myopes, compared to scleral and retinal thickness, the choroidal thickness alone decreased significantly.
Our findings indicate that the changes in the choroid do not necessarily reflect the changes in retinal and scleral thickness and highlight the importance of the choroid as a marker for axial elongation even in high myopes.

15

Case-level artificial intelligence for multi-photo teledermatology submissions: development and internal validation using patient-submitted dermatology images

Patel, V. P.; Sheth, N.; Patel, A.; Patel, Y.

2026-06-01 dermatology 10.64898/2026.05.21.26353816 medRxiv

Top 0.2%

1.9%

Background: Store-and-forward teledermatology commonly relies on several patient-submitted photographs of the same concern, but most dermatology artificial intelligence models classify single images independently. Objective: To develop and internally validate a case-level diagnostic-support model that aggregates multiple patient-submitted photographs for common dermatologic conditions. Methods: We conducted a retrospective diagnostic-modeling study using the Skin Condition Image Network, a public dataset of deidentified self-taken dermatology images from US adults. We curated 2,336 cases comprising 5,041 images across 10 common inflammatory, allergic, and infectious conditions. Cases were split at the submission level into training, validation, and held-out test sets. Frozen general-purpose and dermatology-specific encoders were compared with image-level classifiers and a gated-attention multiple instance learning model that generated one case-level output from 1-3 images. Results: The strongest image-level baseline, dermatology-specific embeddings with random forest classification, achieved macro/micro ROC-AUCs of 0.797/0.854. Case-level aggregation improved discrimination, with dermatology-specific embeddings plus multiple instance learning achieving mean macro/micro ROC-AUCs of 0.819/0.863 across repeated stratified experiments. The locked final model achieved macro/micro ROC-AUCs of 0.800/0.849 on the held-out test set. Balanced-threshold sensitivity/specificity examples were 0.702/0.688 for eczema and 0.818/0.826 for urticaria. Limitations: Internal validation used a 10-condition subset from a US volunteer dataset; external validation, calibration, subgroup performance analysis, and prospective workflow studies are required. Conclusion: Modeling the teledermatology submission as a multi-image case better reflects asynchronous dermatology workflow than single-image classification. 
The model is preliminary clinician-facing support for structured review and triage, not autonomous diagnosis.

16

AI Decision Support for Challenging Teledermatology Cases: MedGemma Performance in the Dermatology ECHO Program

Appiagyei, J. B.; Otu, R. O.; Henry, M. K.; Casterline, B. W.; Becevic, M.

2026-05-26 health informatics 10.64898/2026.05.21.26353523 medRxiv

Top 0.2%

1.7%

Teledermatology expands access to dermatologic expertise in rural settings, yet diagnostic uncertainty persists in low-resource primary care. This retrospective study evaluated MedGemma-4B-IT, a compact multimodal vision-language model, as adjunctive clinical decision support for challenging diagnostic cases. We analyzed 77 zero-concordance cases (360 clinical photographs) from a Dermatology Extension for Community Healthcare Outcomes (ECHO) tele-mentoring program (2016-2021). Zero-concordance cases showed no overlap between the primary clinician's provisional diagnosis and the dermatologist-confirmed diagnosis. The model was prompted using a dermatologist-style format to generate ranked differential diagnoses. Performance was assessed using strict case-level top-k exact-match accuracy and relaxed matching criteria based on fuzzy string similarity. MedGemma achieved 0.0% strict top-1 accuracy, 1.3% top-3 accuracy, 3.9% top-5 accuracy, and 3.9% top-10 accuracy. Relaxed concept-level matching achieved 28.6% top-1, 63.6% top-5, and 67.5% top-10 accuracy. Image-level accuracy was 44.2% (159/360, 95% CI 39.0-49.5%). The model surfaced the correct diagnosis within differential lists in 45.5% of cases despite no exact top-1 matches, suggesting utility for differential expansion rather than definitive diagnosis. Performance varied across diagnostic categories, with highest accuracy in Other categories (54.5%) and lowest in neoplastic conditions (0.0%). Common errors included confusion between inflammatory and other diagnostic groupings. These findings characterize MedGemma performance on real-world teledermatology cases and inform safe, clinician-in-the-loop integration into teledermatology workflows where specialist oversight remains essential.
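
The gap between strict and relaxed top-k accuracy reported above can be illustrated with a minimal sketch. It is not the authors' scoring code: the abstract says only that relaxed matching used fuzzy string similarity, so the use of Python's `difflib.SequenceMatcher`, the `0.8` cutoff, and the function names `strict_top_k`, `relaxed_top_k`, and `top_k_accuracy` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(label.lower().split())


def strict_top_k(ranked: list[str], truth: str, k: int) -> bool:
    """Exact-match hit: the reference diagnosis equals one of the first k predictions."""
    target = normalize(truth)
    return any(normalize(p) == target for p in ranked[:k])


def relaxed_top_k(ranked: list[str], truth: str, k: int, threshold: float = 0.8) -> bool:
    """Fuzzy hit: a top-k prediction reaches the similarity cutoff.

    The 0.8 threshold is a hypothetical value, not one reported in the abstract.
    """
    target = normalize(truth)
    return any(
        SequenceMatcher(None, normalize(p), target).ratio() >= threshold
        for p in ranked[:k]
    )


def top_k_accuracy(cases, k: int, matcher) -> float:
    """Fraction of (ranked_predictions, truth) cases that the matcher scores as a hit."""
    hits = sum(matcher(ranked, truth, k) for ranked, truth in cases)
    return hits / len(cases)


# Made-up example cases, not study data.
example_cases = [
    (["atopic dermatitis."], "atopic dermatitis"),  # near-identical string
    (["acne"], "melanoma"),                         # unrelated diagnosis
]
strict_acc = top_k_accuracy(example_cases, 1, strict_top_k)
relaxed_acc = top_k_accuracy(example_cases, 1, relaxed_top_k)
```

Under these assumptions a prediction that differs from the reference only by punctuation counts as a miss under strict scoring but as a hit under relaxed scoring, which is one way a model can score near zero on exact match yet far higher on concept-level matching.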

17

Home-based binocular serious games in virtual reality to treat visual acuity and stereovision in residual amblyopia: AMBER study

Aurilia, A.; Martin, N.-L.; Simon-Martinez, C.; Antoniou, M.-P.; Bouthour, W.; Bavelier, D.; Backus, B. T.; Dornbos, B.; Blaha, J. J.; Kropp, M.; Muller, H.; Murray, M. M.; Thumann, G.; Steffen, H.; Matusz, P. J.

2026-06-12 ophthalmology 10.64898/2026.06.12.26355255 medRxiv

Top 0.2%

1.7%

Objectives: Amblyopia is a pediatric visual disorder traditionally treated by patching the fellow eye, though many patients retain residual amblyopia post-treatment. Increasing evidence suggests that visual plasticity allows treatment beyond the classical therapeutic window. AMBER evaluated the efficacy of binocular serious games in virtual reality (VR) in residual amblyopia. Methods and Analysis: The monocentric, prospective, randomized, crossover trial (reported as case series) included 14 anisometropic, strabismic, or mixed residual amblyopia patients (6-35 years; 5 children, 9 adults). Participants underwent two 2-month intervention phases: optical correction (standard care) and standard care plus VR games (2.5 h/week), each with a 2-month follow-up. Best-corrected visual acuity (BCVA), stereoacuity, and reading speed were assessed (5 timepoints) using the Sloan and Landolt charts, the Titmus, TNO, Lang II, Asteroid, and MNREAD tests. Compliance and adverse events (AE) were recorded. Results: VR training improved BCVA in 10 amblyopic eyes (Landolt and Sloan), with more pronounced effects in anisometropic patients. Six patients showed improved stereoacuity (Titmus; 4x mixed, 1x anisometropic, 1x strabismic amblyopia), persistent only in children (1x strabismic, 1x mixed amblyopia). Four improvements were observed with TNO (1x), Lang II (1x), Asteroid (0x), and MNREAD (1x). Despite positive trends, when comparing results of individual patients, between both eyes, and with standard treatment, consistency of improvements cannot be conclusively demonstrated. One non-severe AE (dizziness) was reported. Conclusions: Following individual cases, VR training improved BCVA and stereoacuity, particularly in children and patients with high compliance. However, considering the cohort as a whole, consistency of effects has to be confirmed in larger groups.
Thus, the methodologically sophisticated AMBER study revealed differences in VR treatment efficacy between amblyopia types, children and adults, endpoints, and tests, offering valuable data for the design of meaningful future studies. It shows that neurovisual plasticity gauged by VR games offers safe, engaging treatment options for residual amblyopia.

18

Prevalence and pattern of refractive errors among Yanomami Indigenous people in the Brazilian Amazon: a cross-sectional observational study

Chagas Ferreira, M. C.; Pellegrini, M. A.; Sequeira, B. J.

2026-05-26 ophthalmology 10.64898/2026.05.25.26354064 medRxiv

Top 0.2%

1.4%

Background: Refractive errors are the leading cause of preventable visual impairment worldwide, yet data from isolated Indigenous populations remain virtually absent from the global literature. The Yanomami, one of the largest Indigenous peoples in the Americas with recent and limited contact with non-Indigenous society, have no prior epidemiological data on refractive errors. Methods: A cross-sectional observational study was conducted in 2024 at the Yanomami Indigenous Health House, Boa Vista, Roraima, Brazil. A total of 158 self-identified Yanomami individuals aged 5 years or older were examined by an ophthalmologist. Refractive status was classified according to International Myopia Institute criteria. Results: Emmetropia was observed in 67.7% of participants, with a marked age-related decline from 100% in children aged 5 to 9 years to 38.6% in those aged 40 to 59 years. Myopia was present in 16.5% of participants, all low myopia; it was absent in children under 10 years and no high myopia was identified. Astigmatism affected 24.1% of participants and hyperopia 13.3%. Presbyopia was identified in 25.9%. Overall, 25.3% of participants presented with reduced visual acuity attributable to uncorrected refractive error, of whom 67.5% improved to normal or near-normal acuity (p < 0.001). Conclusions: This is the first characterisation of the Yanomami refractive profile, revealing a distinct myopia pattern shaped by high outdoor exposure and minimal near-work demands. Despite this, refractive correction remains effectively inaccessible to this population, leaving preventable visual impairment unaddressed and reflecting a profound health inequity. Corrective lens provision represents a high-impact, scalable intervention for this underserved community.

19

Comparison of early ocular biological parameters in preterm infants with or without Retinopathy of Prematurity

Ma, P. P.; Wu, Q.; Xin, W.; Zhang, L.

2026-05-18 ophthalmology 10.64898/2026.05.14.26353221 medRxiv

Top 0.2%

1.3%

Purpose: To compare ocular parameters (ACD, AL, LT, VL, CCT, ASD, LC, LT/ACD) in preterm infants with retinopathy after treatment, those with spontaneous regression, and those without retinopathy, at postmenstrual ages of 0 (40 weeks), 3, and 6 months. Methods: Cross-sectional study. This research involved 297 premature infants assigned to three groups based on fundus results and intravitreal injection therapy: an ROP post-injection group, an ROP spontaneous regression group, and a non-ROP group. Axial length (AL), anterior chamber depth (ACD), lens thickness (LT), and vitreous length (VL) were assessed in all three groups using a corneal thickness meter at postmenstrual ages (PMA) of 0, 3, and 6 months. The derived parameters ASD (ACD + LT), LC (ACD + LT/2), and LT/ACD were subsequently calculated. A one-way ANOVA revealed statistically significant differences in these ocular parameters among the groups (P < 0.05). Results: Significant differences emerged in anterior chamber depth (ACD) and lens thickness (LT) among the ROP post-injection group, ROP spontaneous regression group, and non-ROP group at 0, 3, and 6 months postmenstrual age (PMA). At 0 months PMA: ACD (F=4.33, P=0.014), LT (F=5.45, P=0.005). At 3 months PMA: ACD (F=17.20, P<0.01), LT (F=15.23, P<0.01). At 6 months PMA: ACD (F=17.89, P<0.01), LT (F=17.21, P<0.01). Central corneal thickness (CCT) also differed significantly among the three groups at 0 months PMA (P<0.01). All ocular parameters correlated significantly with postmenstrual age, with CCT and LT showing a negative correlation. Before 6 months PMA, axial length (AL) and vitreous length (VL) increased significantly, and ACD deepened significantly across all groups (P<0.05). However, LT exhibited no significant change within the ROP group (post-injection group P=0.4; spontaneous regression group P=0.33).
No significant differences existed in any ocular parameters between the ROP post-injection group and the ROP spontaneous regression group (P>0.05). Conclusions: Before 6 months of postmenstrual age (PMA), axial length (AL), vitreous length (VL), and anterior chamber depth (ACD) increased in both the ROP and non-ROP groups; lens thickness (LT) remained unchanged in the ROP group but increased in the non-ROP group. The injection group and the spontaneous regression group showed no significant differences. The primary factors influencing anterior segment development were birth weight (BW), gestational age (GA), and postmenstrual age (PMA).
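
The three derived parameters named in the Methods (ASD = ACD + LT, LC = ACD + LT/2, and the ratio LT/ACD) are simple arithmetic on the measured values, and a short sketch may make them concrete. The formulas come from the abstract; the `Biometry` class, the `derived_parameters` function, the millimetre units, and the example values are assumptions for illustration and are not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Biometry:
    """Measured values for one eye; millimetres are an assumed unit."""
    acd_mm: float  # anterior chamber depth (ACD)
    lt_mm: float   # lens thickness (LT)


def derived_parameters(b: Biometry) -> dict[str, float]:
    """Compute ASD, LC, and LT/ACD as defined in the abstract's Methods."""
    if b.acd_mm <= 0:
        raise ValueError("ACD must be positive to compute LT/ACD")
    return {
        "ASD": b.acd_mm + b.lt_mm,        # ACD + LT
        "LC": b.acd_mm + b.lt_mm / 2,     # ACD + LT/2
        "LT/ACD": b.lt_mm / b.acd_mm,     # lens thickness relative to chamber depth
    }


# Hypothetical round numbers chosen only to make the arithmetic easy to follow.
example = derived_parameters(Biometry(acd_mm=2.0, lt_mm=4.0))
```

With these made-up inputs, ASD is 6.0, LC is 4.0, and LT/ACD is 2.0. Because all three quantities depend only on ACD and LT, group differences in them follow from the ACD and LT differences reported in the Results.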

20

Exploring the Interpretability of AI Decision Support Systems for Surgical Anatomy Recognition

Khan, D. Z.; Adams, T.; Wijekoon, A.; Ramirez Herrera, R.; Bano, S.; McCulloch, P.; Stoyanov, D.; Clarkson, M. J.; Costanza, E.; Blandford, A.; Marcus, H.; CARES Evaluation Group

2026-06-03 surgery 10.64898/2026.06.02.26354729 medRxiv

Top 0.2%

1.3%

Artificial intelligence (AI) decision support systems for surgery hold promise but face barriers to adoption, particularly around the interpretability of their outputs. We conducted an international cross-sectional survey of 47 neurosurgeons to evaluate perspectives on literature-derived explanation techniques for AI-generated anatomical segmentations, using endoscopic pituitary surgery as a high-risk exemplar. Participants ranked certainty scores, certainty maps, saliency maps, scene similarity scores, and nearest-neighbour illustrations, and rated them using a modified Explanation Satisfaction Scale alongside free-text feedback. Certainty-based techniques were consistently ranked and rated highest for interpretability - valued for aligning with surgical decision-making by conveying confidence (via scores) and anatomical boundaries (via maps). Saliency- and similarity-based methods were judged less clinically relevant and better suited to educational settings. Certainty-based explanations, therefore, appear most acceptable to surgeons for clinical integration of decision support systems, though their impact on AI acceptability, trust calibration, and performance requires prospective evaluation across surgical domains.